TWENTY QUESTIONS AS A PREDICTOR OF TEST SCORES
OR BOARD SELECTIONS FOR ADVANCEMENT


By 

Andrew  N.  Dow 


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF EDUCATION


UNIVERSITY OF FLORIDA
1977 


ACKNOWLEDGEMENTS
While the completion of this study would have been impossible
without the cooperation and assistance of many people, it is
hardly proper to name some, and possibly omit others, who have
been especially helpful. A number of my co-workers, my immediate
family, certain members of the University of Florida faculty, and
the career petty officers of the U.S. Navy have provided the
several kinds of support needed while the project was in
progress. But, as I feared,
I  have  omitted  some--the  officer  who  is  designated  EA  at  the 
Naval Education and Training Program Development Center, my
typist,  the  nonfaculty  employees  of  the  University  of  Florida 
(both  in  the  Graduate  School  and  the  College  of  Education), 
and,  last  but  hardly  least,  the  unknown  French  automotive 
engineers  who  designed  the  vehicles  that  made  my  commuting 
trips  comfortable,  safe,  and  conservative  of  fuel. 




TABLE OF CONTENTS


Chapter

ACKNOWLEDGEMENTS

LIST OF TABLES

ABSTRACT

I     INTRODUCTION

II    METHOD

         Research Sample

         Procedure

            Phase I: Correlation Study

            Phase II: Discrimination Study

III   RESULTS

         Phase I: Correlation Study

         Phase II: Discrimination Study

IV    DISCUSSION

         Phase I: Correlation Study

         Phase II: Discrimination Study

         Other Observations and Results

V     RECOMMENDATIONS FOR FUTURE STUDY

REFERENCES

BIOGRAPHICAL SKETCH




LIST OF TABLES


Number

1   Split-Half Correlation Data

2   Test-Retest Correlation Data

3   Intertest Correlation Studies

4   Distribution of Correlation Coefficients
    Series A, B, and C, E-8 and E-9, Selectees
    and Nonselectees

5   Distribution of Correlation Coefficients
    Three Series, E-8 Only, Selectees and
    Nonselectees

6   Distribution of Correlation Coefficients
    Three Series, E-9 Only, Selectees and
    Nonselectees

7   Descriptive Statistics for Differences of
    Mean Scores

8   Competing Groups with t Value of Difference
    Greater than 1.96 Listed in Diminishing
    Difference Order

9   Median Values of Intrasection Reliability
    Based Upon Competing Groups Used in
    Correlation Study

10  The Correlation Coefficients Between the
    Scores Achieved by the Several Specialties
    in the Various Groups of E-8 Candidates

11  Correlation Coefficients Between Scores
    Achieved by Ratings in the Groups of E-9
    Candidates from Series A and B

12  Correlation Coefficients Between Scores
    Achieved by Ratings in the Groups, Across
    Paygrades, Using Candidates in Series A and B




Abstract  of  Dissertation  Presented  to  the  Graduate 
Council  of  the  University  of  Florida  in  Partial 
Fulfillment  of  the  Requirements  for  the 
Degree  of  Doctor  of  Education 

TWENTY  QUESTIONS  AS  A  PREDICTOR  OF  TEST  SCORES 
OR  BOARD  SELECTIONS  FOR  ADVANCEMENT 

By 

Andrew  N.  Dow 
August  1977 

Chairman:     Vynce  A.  Hines 

Major  Department:     Foundations  of  Education 

The validity of a special examination as a part of the
procedure used to screen certain naval personnel who are
candidates for advancement or promotion was investigated. The
study used the same data for a pair of independent analyses:
(1) the use of correlation to determine how well the part
test predicts performance on the other parts of the screening
examination and (2) an evaluation of the discriminality of
the part test as evidenced by its ability to distinguish
group membership. Neither of these studies produced
definitive, positive results. However, in the course of handling
the data, the author observed that there were score patterns
related to the occupational specialty. In order to confirm
or deny intraoccupation consistency, a series of correlation
coefficients was computed. These coefficients were large
enough to suggest further study of several phases of the
phenomenon and the possibility of using this type of test
to predict suitability for a general occupational field. A
brief history of the use of formal examinations to select
people for jobs and promotions introduces the study. The
part test studied is based upon the techniques employed by
William Haney in his Uncritical Inference Test.


CHAPTER  I 
INTRODUCTION 

The  first  use  of  a  formal  test  to  help  in  selecting  a 
person  for  a  special  job  is  unrecorded.     DuBois  (1965)  tells 
of  the  system  of  tests  that  the  Chinese  government  was  using 
approximately  4,000  years  ago.     Agents  of  the  emperor  examined 
the  career  officials  every  3  years;  after  three  examinations, 
the  official  was  either  promoted  or  dismissed.     Details  of 
method and content have been lost in the mists of time.
However, it is certain that in 1115 B.C. a formal system of
examining candidates for office was in use. These tests covered
five basic arts: music, archery, horsemanship, writing, and
arithmetic.

There is evidence that the content and approach of the
Chinese examinations did not remain fixed. As an example,
after the advent of Confucius, examinations were based upon
his classical works and had been expanded to include "...skill
in the rites and ceremonies of public and social
life...familiarity with the geography of the empire, civil law,
military matters, agriculture, and the administration of revenue"
(DuBois, 1965, p. 31). During the 14th century A.D., each
examinee was shut into a private cell to write a poem and one
or two essays as assigned by the chancellor. The poems and
essays were screened by the chancellor and his staff, who were
looking for the 1% with the most beautiful penmanship and most
graceful diction. The successful candidates then competed
among their peers (penmanship was not a considered factor this
time) to be in the top 1% of the original 1% (the best 1 in
10,000); these in turn competed for the top 3% of their group.
The successful were among the top 3 of each 1,000,000
candidates who started; they were then assigned their jobs by lot.
They also had the privilege of competing a fourth time, the
successful becoming the emperor's historians and poets, and
in some cases, provincial examining chancellors. This basic
program was abolished in 1905 when the Chinese realized that
the examinations had become counterproductive.

India  was  the  route  by  which  civil  service  tests  came  to 
Great Britain, at the suggestion of men who knew of the
Chinese testing program. The members of our Congress who set
up  the  U.S.  Civil  Service  studied  the  Chinese  system;  one  of 
them,  Jenckes  (1868),  wrote  12  pages  on  the  civil  service  of 
China.     Obviously,  the  written  examinations  given  by  the 
U.S.  Civil  Service  Commission  were  originally  inspired  by 
the  Chinese  program,  even  though  their  content  and  structure 
were  new. 

The  means  by  which  the  promotion  examination  arrived  at 
the  military  and/or  naval  personnel  office  probably  can  be 
derived  from  federal  archives;  however,  that  route  is  not 
pertinent.     We  do  know  that  the  Navy  has  used  some  type  of 
promotion  examination  for  more  than  60  years.     The  Van  der  Veer 
revision of The Bluejacket's Manual states:

Permanent appointments will be issued by the Bureau
of Navigation to chief petty officers only after
their fitness for promotion has been shown before
a  board  consisting  of  three  officers  not  attached 
to  the  ship  on  which  the  candidate  is  serving.... 
The  examination  shall  show  that  the  candidate  is  in 
all  respects  fitted,  under  such  conditions  as  the 
Bureau  of  Navigation  may  prescribe,  to  fill  the 
rating  in  which  he  seeks  a  permanent  appointment. 
(1916,  p.  64) 

The 1939 revision of The Bluejackets' Manual (United States
Navy) delineates the requirements for advancement in rating
and includes "pass satisfactorily a technical examination"
(p. 159). The 14th edition of The Bluejackets' Manual
(United States Navy, 1950, p. 242) centralizes control of the
examination: "Before enlisted personnel may be advanced, they
must:...have passed satisfactorily a professional examination
for the rate involved as prescribed in the Manual of
Qualifications for Advancement in Rating, NAVPERS 18068"
(promulgated by the Bureau of Naval Personnel which is the
successor to the aforementioned Bureau of Navigation).

According to the 19th edition of The Bluejackets' Manual:


Until  1958,  a  person  who  had  advanced  to  chief  petty 
officer  (E-7)  had  gone  as  high  as  an  enlisted  man 
could  go.     Then  the  grades  of  senior  chief  petty 
officer  (E-8)  and  master  chief  petty  officer  (E-9) 
were  established  to  give  additional  recognition 
and  prestige  to  men  with  outstanding  technical 
superiority, leadership, and supervisory ability....
Candidates for E-8 and E-9, having met all the
requirements and passed the examinations, will have
their service records closely compared and checked
by a selection board of officers in the Navy
Department. (Naval Institute Press, 1973, p. 563)

Thus, two "supergrades" were established, and the promotions
were to be made by a board from among those who surpassed a
cutting score on an examination. At first the responsibilities
and privileges of the new grades were not well defined, and
there was considerable discussion about the specifics of the
screening examinations.

The first candidates for the supergrades were examined
in August 1958. Their tests covered three broad
areas--Technical, Military/Leadership, and
Comprehension/Reasoning--with 60, 20, and 70 items,
respectively. Thus, it began with 80 content items and 70
aptitude items. The Military/Leadership items and the
Comprehension/Reasoning items were common for all occupational
specialties; the technical items were specific for the
individual rating and were based upon the qualifications set
forth in the Manual of Qualifications for Advancement in Rating
(Bureau of Naval Personnel, current series). The
Comprehension/Reasoning section was composed of three kinds of
items: verbal analogy, arithmetical reasoning, and mechanical
comprehension (Macaluso & Dow, 1969, p. 155).

After a minor shift in emphasis on the 1959 examinations,
the 1960 series appeared in a format of 30 items in each of
three areas--Technical (professional), Military, and
Supervision--and 20 items in each of three aptitude
areas--Mechanical Comprehension, Verbal Analogy, and
Arithmetical Reasoning. This layout was used to test all
candidates for Senior or Master Chief Petty Officer through
1969, and some candidates for several years after that
(Macaluso & Dow, 1969). The E-8/E-9 advancement examinations
given during February 1970 were split about two to one, with
the established format being used for most ratings and a new
format, to be described later, being used for approximately
one-third, or 22 ratings (Dow & Macaluso, 1970).


After several years of change, nonconsensus, and
indecision, the Navy set up a task force under Admiral Crutchfield
to  devise  a  set  of  definitions  for  Senior  Chief  Petty  Officers 
and  Master  Chief  Petty  Officers  (Dow  &  Macaluso,  1970,  p.  99). 

Admiral Crutchfield's team, which was composed of
knowledgeable  officers  and  the  Master  Chief  Petty 
Officer  of  the  Navy,  D.  Black,  surveyed  the  entire 
Navy  to  find  out  what  E-8s  and  E-9s  really  were-- 
how  they  were  being  used,  what  the  Navy  expected 
of  them,  and  how  the  Navy  would  like  to  be  able 
to  use  them.     (Macaluso  &  Dow,  1969,  p.  158) 

The published statement released by the Crutchfield group
was quoted in part by Macaluso and Dow (1969, p. 159). The
definitions for senior and master chief are given in full:

Senior  Chief  Petty  Officer  (E-8) 

The  role  of  the  Senior  Chief  Petty  Officer  is  that 
of  an  ENLISTED  technical  or  specialty  supervisor. 
He  functions  as  an  enlisted  technical  or  specialty 
expert  within  his  rating,  serving  as  he  does,  as 
the  second  senior  enlisted  petty  officer  within 
that  rating.     His  primary  responsibility  is  to  bring 
to  bear  his  broad  training,  knowledge  and  experience 
in  providing  direction  and  supervision  to  enlisted 
personnel  engaged  in  performing  the  functions  and 
tasks  associated  with  the  work  for  which  his  rating 
is  responsible.     He  plans  and  administers  on-the- 
job  and  other  training  programs  for  subordinates 
serving in his specialty. On occasion, he
functions outside the area of his rating in areas of
leadership, administration and supervision as a
senior enlisted advisor in matters concerning
enlisted personnel, but the main thrust of his
supervisory and leadership ability lies in the area of
broad  technical  or  specialty  expertise  related  to 
his  rating.     In  terms  of  enlisted  military  seniority, 
he  is  second  only  to  the  Master  Chief  Petty  Officer. 

Master  Chief  Petty  Officer  (E-9) 

The role of the Master Chief Petty Officer is that
of the senior ENLISTED technical or specialty
administrator within his rating. The Master Chief
Petty Officer is the senior enlisted grade in terms
of military, technical, supervisory and
administrative responsibility and expertise. His primary
function is to bring to bear his extensive training,
knowledge and experience in providing senior-level
enlisted supervision and administration to his
entire rating, thereby insuring maximum
efficiency of the work force and equipment assigned
in the effective accomplishment of the functions
and tasks for which his immediate organization is
responsible. He is responsible for organizing,
directing, and coordinating the various programs
implemented for the purpose of instruction and
supervision of subordinates. In units or at
activities where the situation requires, the Master
Chief Petty Officer, in addition to his normal
functions, supplements the officer corps in the
overall supervision and administration of the men
and equipments associated with the functioning of
the organization to which assigned, whether or
not related to his rating. In addition to
functioning within his specialty as described, the
Master Chief Petty Officer also can be expected,
when so assigned, to be capable of functioning
effectively, outside his particular area, in areas
of leadership, administration and supervision, as
a senior enlisted advisor for the command in which
serving in matters concerning enlisted personnel.

The report presented by Macaluso and Dow (1969, p. 160)
discusses these definitions as follows:

Other than military rank and precedence, the
principal difference between the two is breadth. In
the case of the E-8, the definition states "On
occasion, he functions outside the area of his
rating in areas of leadership, administration, and
supervision..." While for the E-9, it states:
"...the Master Chief Petty Officer can be expected,
when so assigned, to be capable of functioning
effectively, outside his particular area, in areas of
leadership, administration, and supervision..." and
"...the Master Chief Petty Officer, in addition to
his normal functions, supplements the officer corps
in the overall supervision and administration of the
men and equipments associated with the functioning
of the organization to which assigned, whether or
not related to his rating." These quoted phrases
clearly indicate the increase in breadth when the
chief advances to E-8, and thence to E-9.

The exam center was also given the task of
proposing a change in the makeup of the advancement
examinations for the candidates for E-8 and E-9. All
officers and professional civilians were encouraged
to participate by submitting suggestions and the
rationale for them. Technical or professional items
were necessarily included in the recommendations
since the definitions stressed technical expertise.
Beyond that the order was to find the man who has
the level and kind of mentality to be able to
broaden himself as well as keep abreast of the latest
developments, and who is able to communicate
effectively with superiors and subordinates as well as
equals. The term "broaden" as used here includes
more than a wider knowledge of his own field and
the related fields in which he is expected to
supervise or administer; it includes developing new and
more sophisticated approaches to technical as well
as human relations problems. The kinds of test items
that were proposed to do this job were varied,
indeed. They included new kinds of problem-solving
questions, aptitude items tailored to the ratings, a
communicability section consisting of a reading
selection and a series of questions that can be
answered with a minimum of inference, Naval knowledge
items, Naval terminology, verbal analogies, and
others. It was also suggested that the examination
should be lengthened to 250 items from the present
150, with 50 of the additional items being strictly
for research and development, not counting toward the
candidates' scores.

The Bureau of Naval Personnel accepted some of the
suggestions in the interim formats that were adopted
for the 1970 examinations in those ratings with new
qualifications in Change 4 of the Quals Manual. For
the first time, E-8 and E-9 have different formats,
each totaling 150 items. The two formats are similar
as you will see. The E-8 format consists of three
special areas and three general areas. The E-8
special areas are (a) Technical, (b) Military and
Collateral Duties, and (c) Special Aptitudes. The
general areas on the new E-8 are (a) Supervision,
(b) Communication, and (c) Problem Solving. On the
other hand, the new E-9 has only two special areas,
(a) Technical and (b) Military and Collateral Duties,
but it does have three general areas with one
difference from the E-8. These general areas are (a)
Administration, (b) Communication, and (c) Problem
Solving. In some instances, the number of items in
an area differs between the two levels. E-8 has 50
technical items, while E-9 has 45; E-8 has 20 military
and collateral duties items with E-9 increased to 30;
E-8 has 20 special aptitude items, but E-9 has none.
In general areas, both have 20 communication items
and 20 problem solving items, but E-8 has 20
supervision questions against 35 administration items on
the E-9. In summary, the principal differences,
other than in numbers of items per section, are the
lack of a special aptitude section on the E-9 exams,
and the E-8 supervision section in place of the
administration items found on the E-9.


Just as in the case of the special aptitude, problem
solving, and administration sections, the Bureau of Naval
Personnel gave little guidance as to what should actually
comprise the Communication section. A group working at the
Naval Examining Center (a field activity of the Bureau of
Naval Personnel, then located at the Naval Training Center,
Great Lakes, Ill.) chose a structure devised by Haney (1953,
1958) and used in The Uncritical Inference Test (1955) as the
basis for the Communication section of the examinations.
Roughly, Haney's test is composed of a short story of 40 to
200 words, followed by a series of true-false-? questions.
The section based upon Haney's structure is the principal
concern of the studies reported in this paper.

The directions for taking the Uncritical Inference Test are

1. You will read a brief story. Assume that all
of the information presented in the story is
definitely accurate and true. Read the story carefully.
You may refer back to the story whenever you wish.

2. You will then read statements about the story.
Answer them in numerical order. DO NOT GO BACK to
fill in answers or to change answers. This will
only distort your test score.

3. After you read carefully each statement,
determine whether the statement is:

a. "T"--meaning: On the basis of the
information presented in the story the statement
is DEFINITELY TRUE.

b. "F"--meaning: On the basis of the
information presented in the story the statement
is DEFINITELY FALSE.

c. "?"--meaning: The statement MAY be true
(or false) but on the basis of the information
presented in the story you cannot be
definitely certain. (If any part of the
statement is doubtful, mark the statement "?.")

4. Indicate your answer by circling either "T"
or "F" or "?" opposite the statement. (Haney,
1955)

When Haney (1953) developed the original two forms of
the Uncritical Inference Test, he investigated reliability
using both the split-half correlation method and the
test-retest procedure. The two forms of the test were designated
"A" and "B," each consisting of 67 true-false-? statements.
The results of his computations are summarized in Tables 1
and 2.


Table 1
Split-Half Correlation Data

                 Correlation    Spearman-Brown Corrected
Form     N       Coefficient    Correlation Coefficient

A       164         .762                 .928
B       136         .818                 .947


Table 2
Test-Retest Correlation Data

                First    Training After    Second
Group     N     Form     First Form        Form      Correlation

Exp 1    61      A       Full               B           .739
Exp 2    67      B       Full               A           .582
Exp 3    35      A       Limited            B           .607
Con 1    45      A       None               B           .666
Con 2    42      B       None               A           .559
Con 3    61      A       None               B           .677

The data presented in Tables 1 and 2 indicate that the
two forms of the Uncritical Inference Test (Haney, 1953, 1955,
1958) have considerable internal consistency and are
moderately equivalent to each other.


The  uniqueness  of  a  trait,  if  the  tests  are  measuring  a 
trait,  was  also  investigated.     The  correlation  coefficients 
between  the  two  forms  of  the  Uncritical  Inference  Test  and 
the Cooperative Reading Comprehension Test (1940, 1951), the
Ohio State University Psychological Test (1919-1947), and
the Watson-Glaser Tests of Critical Thinking Appraisal (1942),
respectively,  were  calculated.     The  data  from  these  studies 
are  summarized  in  Table  3. 

The data in Table 3 indicate that scores on the
Uncritical Inference Test are relatively independent of scores on
the three tests used as criteria. Combined with the data from
Tables 1 and 2, the data from Table 3 show that there is a
distinctive trait measured by the Uncritical Inference Test.


Table 3
Intertest Correlation Studies

Criterion          U-C-I Test              Correlation
Test               Form           N        Coefficient

Coop Reading        A            124          .267
                    B             79          .198
OSU Psy             A             50          .325
                    B             31          .180
Watson-Glaser       A             47          .240
                    B            115          .249

The  Communication  sections  of  the  examinations  used  for 
the  screening  of  candidates  for  promotion  to  Senior  Chief 
Petty  Officer  (E-8)  and  Master  Chief  Petty  Officer  (E-9)  are 
similar  to  the  Uncritical  Inference  Test  and  are  constructed 
in  the  same  general  form.     The  directions  to  the  candidates 
are essentially the same as those for Haney's (1955)
Uncritical Inference Test cited earlier. In each instance, the Navy
versions  of  the  test  follow  the  story  with  exactly  20  true- 
false-?  type  questions. 

The raw score is the number of correct responses. Each
raw score is converted to a standard score based upon the
performance of the other candidates in his rate (occupational
specialty and paygrade). The standard score is the result of
subtracting the mean from the raw score, dividing the
difference by the standard deviation, adding 5 to the quotient,
and multiplying the sum by 10, using the descriptive
statistics from the candidate's own competing group (examination
rate). Each candidate receives a profile based upon his
standard score on each subtest (section) so that he may
compare his performance on each with that of his immediate peers.
The selection board receives a standard score based upon
total test performance for each candidate whose score meets
the prescribed standard for submission. The selection board
does not know the candidates' scores or standings on the
subtests; nor does the board know the standards used by any
previous board convened for a similar purpose.
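
The standard-score conversion described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `standard_scores` and the sample raw scores are hypothetical, and the use of the population standard deviation (rather than the sample form) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev

def standard_scores(raw_scores):
    """Convert raw scores to standard scores within one competing group.

    Follows the steps in the text: subtract the group mean, divide by the
    group standard deviation, add 5, then multiply by 10.
    Assumption: the population standard deviation is used.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        raise ValueError("standard deviation is zero; all scores are identical")
    return [10 * ((x - m) / sd + 5) for x in raw_scores]

# Hypothetical raw scores on a 20-item subtest for one competing group.
group = [8, 10, 12, 14, 16]
scores = standard_scores(group)
```

Algebraically the result equals 50 + 10z, so a candidate at the group mean receives 50 and each standard deviation above or below the mean shifts the score by 10.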

The  present  study  investigates  the  possible  predictive 
value  of  the  Communication  section  with  other  parts  of  the 
examination and with the independent decisions of the
selection board.


CHAPTER  II 
METHOD 

This study has two independent aspects: Phase I is an
inquiry into the relationship between the score on the
Communication subtest and that made on the rest of the
examination; Phase II is an investigation into the possibility that
the score on the Communication subtest of the E-8 and E-9
advancement screening examinations can be used to
discriminate between the selectees and nonselectees of the selection
board. Thus, there is a correlation study (Phase I) and a
related, yet independent, discrimination study (Phase II),
which were based upon data from the same persons and groups
of persons.

Research  Sample 
As the candidates for advancement compete only within
their own group or specialty, a representative group of
specialties, or ratings, was used for the study. The chosen
ratings encompassed deck (seamanship), ordnance, medical,
mechanical maintenance, mechanical operation, clerical,
electrical and electronic occupations. Some of the specialties
are primarily with the aviation group, some are definitely
waterborne, and a few may expect both aviation and surface
assignments. Ratings with small populations were generally
avoided. Some ratings with large populations were not used
because there were matching large groups in similar occupations
that were included. In the actual studies, except for one
rating with an unusually large group of selectees, all
qualified selectees were included. For the one large group,
qualified selectees were alternately included and rejected,
producing a sample half the size of the total group of
qualified selectees. In the relatively small population ratings,
all qualified nonselectees were used; in the larger ratings,
a reasonably random sample of qualified nonselectees, usually
numbering from 90 to 150, was used.
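
The two sampling rules described above can be illustrated as follows. This is only a sketch: the function names, the fixed seed, and the target size of 120 are hypothetical choices, and the text does not state how its "reasonably random" sample was actually drawn.

```python
import random

def alternate_sample(selectees):
    """Keep every other record (include, reject, include, ...)."""
    return selectees[::2]

def nonselectee_sample(nonselectees, target=120, seed=0):
    """Use everyone in a small group; otherwise draw a simple random sample.

    Assumption: `target` and `seed` are illustrative values only.
    """
    if len(nonselectees) <= target:
        return list(nonselectees)
    return random.Random(seed).sample(nonselectees, target)

large_group = list(range(200))        # stand-in candidate identifiers
half = alternate_sample(large_group)  # half the size of the full group
```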

The terms "qualified selectee" and "qualified nonselectee"
have been used, but not defined. For the purposes of this
study, "qualified" means that the individual took the regular
examination on the regular date for the particular
advancement cycle and is not missing any of the scores that are used
in this study. "Selectee" means that the individual was
chosen by the selection board to be advanced; "nonselectee"
means that the board reviewed the individual's record, but
did not choose him for advancement. There is a third
category, "Board Ineligible," which indicates that these persons'
test scores were too low to have their records sent to the
selection board. The test scores and other performances of
these individuals were not included in this study.

Procedure 
Phase  I:     Correlation  Study 

In the correlation study, the communication scores were
paired with the total scores on the rest of the advancement
screening examination, individual by individual in the same
petty officer occupational specialties and paygrades as used
also in the discrimination study, and r_xy's computed for the
bivariate distribution data.

Phase II: Discrimination Study

In the discrimination study, the communication scores
of selectees were contrasted with the communication scores
of nonselectees from the same groups. A null-hypothesis
approach was used to investigate the possibility that these two
groups--selectees and nonselectees--really perform
differently on the communication subtest by assuming no difference
in accord with the null hypothesis.

Computations  were  performed  on  a  Monroe  Beta  326  Scien- 
tist programmable  calculator  with  a  Monroe  392  portable  tape 
drive.     The  calculations  were  done  with  a  program  derived 
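from Monroe program 9211W, Linear Regression (Monroe Regression
Analysis Pak). Program 9211W calculates the correlation
coefficient, using this formula

\[
r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}
         {\sqrt{\left(\sum X^{2} - \dfrac{\left(\sum X\right)^{2}}{N}\right)
                \left(\sum Y^{2} - \dfrac{\left(\sum Y\right)^{2}}{N}\right)}}
\]

and concurrently, the mean and standard deviation of the Xs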
and  Ys.     To  use  this  program,  the  operator  puts  the  proper 
tape  into  the  392  and  reads  it  into  the  program  storage  bank 
in  the  326,  according  to  the  directions  in  the  manual  (Monroe 
326 Manual). Following the directions that accompany the
program (Monroe Regression Analysis Pak), the operator enters
the raw data into the 326's Sundstrand-type keyboard. When
all values in a set have been entered, the operator, following
the program directions, instructs the calculator to display
the desired results.

The  readily  available  data  did  not  include  one  of  the 
experimental  values,  but  the  available  values  made  it  possible 
to  calculate  the  missing  value.     Specifically,   the  correlation 
study  was  set  up  to  correlate  the  score  on  a  single  subtest 
with  the  sum  of  the  scores  on  the  remaining  subtests,  and, 
while  the  total  scores  (all  subtests)  and  the  part  (subtest) 
scores  were  given,   the  sum  of  the  remaining  parts  was  not  pro- 
vided directly.     The  program  9211W  was  modified  so  that  in- 
stead of  entering  X  and  Y,  X  and  Z  were  entered,  and  Y  was 
calculated  by  subtracting  X  from  Z,   (Y  =  Z  -  X).     The  raw 
score  on  the  Communication  subtest  became  X,  the  total  raw 
score  became  Z,  and  the  total  of  the  scores  on  the  remaining 
subtests  became  Y.     The  modified  program  9211W,  after  per- 
forming the  subtraction,  calculates  the  correlation  coeffi- 
cient, the  means,  and  the  standard  deviations  based  on  X  and 
Y  the  same  as  the  original  program  9211W. 
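The modification described above can be sketched in Python. This is only an illustrative sketch, not the Monroe program itself: the function name and the sample scores are hypothetical, and the use of n - 1 in the standard deviations is an assumption, since the text does not say which divisor program 9211W used.

```python
import math


def correlate_part_with_remainder(pairs):
    """Correlate a subtest score X with the remainder Y = Z - X.

    pairs: list of (x, z) tuples, where x is the subtest raw score and
    z is the total raw score for one individual.
    """
    n = len(pairs)
    sum_x = sum_y = sum_xy = sum_x2 = sum_y2 = 0.0
    for x, z in pairs:
        y = z - x  # remainder of the examination, derived by subtraction
        sum_x += x
        sum_y += y
        sum_xy += x * y
        sum_x2 += x * x
        sum_y2 += y * y

    # Computational (raw-score) form of the Pearson formula.
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_y = sum_y2 - sum_y ** 2 / n
    sp_xy = sum_xy - sum_x * sum_y / n

    return {
        "r": sp_xy / math.sqrt(ss_x * ss_y),
        "mean_x": sum_x / n,
        "mean_y": sum_y / n,
        # Assumption: sample standard deviation (divisor n - 1).
        "sd_x": math.sqrt(ss_x / (n - 1)),
        "sd_y": math.sqrt(ss_y / (n - 1)),
    }


# Hypothetical (subtest, total) raw scores for three candidates.
result = correlate_part_with_remainder([(10, 60), (12, 66), (14, 72)])
print(result)
```

Entering X and Z and deriving Y inside the loop mirrors the keying procedure described in the text: the operator never has to compute the remainder score by hand.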

For each rating used in the study, the X and Z values
for each selectee were entered, and the $r_{xy}$, $\bar{X}$, $\bar{Y}$,
$SD_x$, and $SD_y$ were calculated for the selectees from that
rating. The same was then done for the nonselectees in that
rating. $\bar{X}_s - \bar{X}_n$, $SE_{diff}$, and t values,
$(\bar{X}_s - \bar{X}_n) \div SE_{diff}$, were calculated for each
rating used.
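The t value described above can be illustrated with a short Python sketch. The text does not give the formula used for the standard error of the difference, so the unpooled form below is an assumption, and the function name and input figures are hypothetical.

```python
import math


def t_for_mean_difference(mean_s, sd_s, n_s, mean_n, sd_n, n_n):
    """Return (difference, SE of difference, t) for selectees vs. nonselectees.

    Assumption: SE_diff = sqrt(SD_s^2 / N_s + SD_n^2 / N_n); the source
    does not state which standard-error formula was used.
    """
    difference = mean_s - mean_n
    se_diff = math.sqrt(sd_s ** 2 / n_s + sd_n ** 2 / n_n)
    return difference, se_diff, difference / se_diff


# Hypothetical rating: 36 selectees and 64 nonselectees.
diff, se, t = t_for_mean_difference(15.0, 3.0, 36, 14.0, 4.0, 64)
print(f"difference = {diff:.2f}, SE_diff = {se:.3f}, t = {t:.2f}")
```

A positive t means the selectees scored higher on the subtest; a negative t means the nonselectees did.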


CHAPTER  III 
RESULTS 

Phase  I:     Correlation  Study 
The  relation  of  the  scores  on  the  Communication  subtest 
of  the  advancement  screening  examination  to  the  sum  of  the 
scores  on  the  remaining  subtests  of  the  same  examination  will 
be  presented  first.     For  the  selected  specialties  scattered 
across  two  paygrades  and  three  yearly  examination  cycles 
(series)  designated  A,  B,  and  C  with  A  the  most  recent  and  C 
the earliest, 237 Pearson product-moment correlation coefficients
were calculated. Table 4 is a frequency distribution
of these 237 correlation coefficients ($r_{xy}$).

As noted in Table 4, the median of these coefficients is
+.194. Approximately one-half of them fall within the range
of +.100 to +.299 and three-quarters fall between .000 and
+.400. Six of the coefficients fall in the range of +.650 to
+.950; not one of them was based upon a population as large
as 10. The small size of the populations for the larger
coefficients of correlation casts considerable doubt upon
their significance.
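The doubt expressed above can be made concrete with the usual t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is an illustration only: the test is not one the study reports using, the group size of 9 is hypothetical (the text says only that no group was as large as 10), and the critical values are the two-tailed .05 entries of a standard t table.

```python
import math

# Two-tailed .05 critical values of t, indexed by degrees of freedom
# (n - 2), copied from a standard t table.
CRITICAL_T_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776,
                 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}


def t_from_r(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def significant_at_05(r, n):
    """True if r based on n pairs (3 <= n <= 10) passes the .05 test."""
    return abs(t_from_r(r, n)) > CRITICAL_T_05[n - 2]


# Hypothetical group of 9 candidates at both ends of the +.650 to +.950 range.
for r in (0.65, 0.95):
    print(r, round(t_from_r(r, 9), 2), significant_at_05(r, 9))
```

Under these assumptions a coefficient at the low end of the range falls short of the .05 criterion with only nine cases, while one at the high end still passes, so the caution applies unevenly across the six coefficients.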

Table  5  is  a  distribution  of  the  correlation  coefficients 
based  upon  the  results  of  candidates  for  advancement  to  E-8. 
There  is  a  separate  distribution  for  each  of  the  three  series 
and  an  aggregate  distribution  to  cover  the  three  series. 


Table 4

Distribution of Correlation Coefficients
Series A, B, and C; E-8 and E-9; Selectees and Nonselectees

No. of        Correlation
Cases         Coefficient

   2         .800 to  .949
   1         .750 to  .799
   1         .700 to  .749
   2         .650 to  .699
   2         .600 to  .649
   2         .550 to  .599
   4         .500 to  .549
   5         .450 to  .499
   5         .400 to  .449
  15         .350 to  .399
  11         .300 to  .349
  29         .250 to  .299
  35         .200 to  .249
  35         .150 to  .199     (Median = +.194)
  17         .100 to  .149
  19         .050 to  .099
  19         .000 to  .049
  11        -.050 to -.001
   7        -.100 to -.051
   2        -.150 to -.101
   7        -.200 to -.151
   1        -.250 to -.201
   2        -.300 to -.251
   2        -.350 to -.301
   1        -.999 to -.351

Note. Brackets in the original table mark central portions of the
distribution containing about 49%, 76%, and 90% of the cases.


Table 5

[Distribution of correlation coefficients for candidates for advancement to E-8, by series (A, B, and C) and in aggregate. The table body is not legible in this copy.]

Table  6  is  analogous  to  Table  5  but  is  based  upon  the 
data  derived  from  the  records  of  candidates  for  advancement 
to  E-9. 

The  Communication  subtest  for  each  series  and  paygrade
is  unique;  the  instructions  and  basic  concept  remain  the 
same,  but  the  actual  story  and  questions  are  different  for 
each.     The  fluctuations  in  median  coefficients,  as  shown  in 
Tables  5  and  6,  probably  are  the  result  of  instrument  instabil- 
ity combined  with  interseries  population  changes. 

Phase  II:     Discrimination  Study

The  discrimination  study  was  based  upon  the  descriptive 
statistics  of  selectees  and  nonselectees  in  the  ratings  or 
occupational  specialties  used  in  the  correlation  study.  For 
each rating in the study, the arithmetic mean and standard
deviation of the selectee and nonselectee groups for a particular
series were calculated concurrently with the correlation
coefficient for the pair. Thus the two studies used the
same  raw  data.     The  results  of  the  discrimination  study  are 
found  in  Table  7. 

Table  7  presents  a  statistical  summary  of  the  differences 
of  mean  scores  by  series  and  paygrade.     As  previously  men- 
tioned, the  three  series  of  examinations  are  designated  A,  B, 
and  C;  the  paygrades  E-8  and  E-9  are  indicated  simply  as  8 
and  9. 

Table 6

[Distribution of correlation coefficients for candidates for advancement to E-9, by series (A, B, and C) and in aggregate. The table body is not legible in this copy.]

The mean raw score on the Communication subtest was calculated
for each rate's selectees and nonselectees individually
at the time each $r_{xy}$ was calculated. The mean Communication
raw  score  for  the  nonselectees  was  subtracted  from  that  of 
the  selectees  for  the  same  rate  and  series;  these  difference 
scores  are  the  Difference  of  Mean  Scores  or  Differences  that 
are dealt with in Tables 7 and 8. Each arithmetic mean in
Table 7 is the mean of these differences over all the ratings
of a single paygrade in a single series for which the difference
between selectee and nonselectee mean raw scores was calculated.
Each standard deviation is the actual standard deviation of those
differences. The mean t value is the actual mean of the t values
as calculated individually. The weighted aggregate mean of the
difference scores is 0.213. The aggregate mean of the t values
is 1.03.


Table 7

Descriptive Statistics for Differences of Mean Scores

Series and    Number of    Arithmetic    Standard     Mean t
Paygrade      Ratings      Mean          Deviation    Value

A-8              30          .147          .53          .89
A-9              12          .303          .68         1.22
B-8              30          .203          .72         1.22
B-9              20          .335          .73          .93
C-8              19          .172          .51          .97
C-9               9          .156          .38          .90
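The aggregate figures quoted just before Table 7 can be re-derived from the table itself. The sketch below is a check on the arithmetic, not a description of how the author computed the values; weighting each row by its number of ratings is an assumption, and the variable and function names are hypothetical.

```python
# (series-paygrade, number of ratings, mean difference, mean t), from Table 7.
TABLE_7 = [
    ("A-8", 30, 0.147, 0.89),
    ("A-9", 12, 0.303, 1.22),
    ("B-8", 30, 0.203, 1.22),
    ("B-9", 20, 0.335, 0.93),
    ("C-8", 19, 0.172, 0.97),
    ("C-9", 9, 0.156, 0.90),
]


def weighted_mean(rows, value_index):
    """Mean of one column, weighting each row by its number of ratings."""
    total_weight = sum(row[1] for row in rows)
    return sum(row[1] * row[value_index] for row in rows) / total_weight


total_ratings = sum(row[1] for row in TABLE_7)
mean_difference = weighted_mean(TABLE_7, 2)
mean_t = weighted_mean(TABLE_7, 3)
print(total_ratings, round(mean_difference, 3), round(mean_t, 2))
```

With this weighting the 120 competing groups give values that round to the 0.213 and 1.03 reported in the text.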

In Table 8 the 11 competing groups that produced a difference
score t value greater than 1.96 are listed. As in Table 7, the
series are designated A, B, and C; the paygrades, simply 8 and 9.
The ratings are designated by individual code numbers rather than
by their actual titles or abbreviations. The competing groups are
listed in order of magnitude of difference; the group with the
largest difference is at the top. In addition to the difference,
the t value of the difference is given for each group. A negative
difference indicates that the nonselectees had a higher mean raw
score on the Communication subtest than the selectees of the
particular competing group. A negative difference is the opposite
of what was hoped for; in other words, if a high total score
predicts proficiency, then a high score on a subtest would also be
expected to indicate proficiency and likelihood of being selected.

Table 8

[Competing groups whose difference scores produced t values greater than 1.96, listed by series, paygrade, and rating code, with the difference and its t value. The table body is not legible in this copy.]

Note that two competing groups of the 120 used had positive
differences of two points, or more, which were found to
be statistically significant at the 1% level of confidence
(actually, at the .01% level or beyond, as their t values are
markedly greater than 3.8). This gives two groups, or 1.67%,
with a real difference of two points between selectees and
nonselectees on a 20-item test (the selectees of one group
had a mean of 14.34; the other had 18.33).

There are four other groups with difference t values
large enough to be significant at the 1% level; two of these
are negative and two are positive. The algebraic sum of the
differences for these four is +.13; their mean is +.0325. It
may be safe to say that these four cancel one another and
raise the suspicion that they may be due to chance as much as
to a real difference.

There are five more that are significant at the 5% but
not at the 1% level; of these five, one is negative. If we
continue the concept that a negative cancels a positive, this
gives three more at the 5% level of confidence. Add these
three to the two that are beyond the 1% level, and there are
five, or 4.17%, uncancelled differences that are significant
at the 5% level.

Ignoring the negative differences and including those
significant at the 1% level, there are six groups with differences
of about 1 (actually +.99) or greater, which is significant
at the 5% level. Six of the 120 is exactly 5%. If
all positive differences are included there are 8 that are
significant at the 5% level; 8 is 6.67% of 120. These differences
are not large and are very likely due to chance and
to the subtest's low positive relationship to the total test
score.
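The percentages in the preceding paragraphs, and the comparison with what chance alone would produce, amount to simple arithmetic on the 120 competing groups. The Python sketch below restates that arithmetic; the labels and function names are hypothetical, and the chance expectation assumes that every group is an independent test at the stated level.

```python
TOTAL_GROUPS = 120  # competing groups in the discrimination study

# Counts quoted in the text above.
COUNTS = {
    "positive, beyond the 1% level": 2,
    "uncancelled, 5% level": 5,
    "positive, difference of about 1 or more": 6,
    "all positive, 5% level": 8,
}


def percent_of_groups(count, total=TOTAL_GROUPS):
    """Share of the competing groups, as a percentage."""
    return 100.0 * count / total


def expected_by_chance(alpha, total=TOTAL_GROUPS):
    """Number of groups expected to reach level alpha if no real
    difference exists (assumes independent tests)."""
    return alpha * total


for label, count in COUNTS.items():
    print(f"{label}: {count} of {TOTAL_GROUPS} = {percent_of_groups(count):.2f}%")
print("expected at the 5% level by chance:", expected_by_chance(0.05))
```

Because about six of 120 groups would be expected to reach the 5% level by chance, counts of five to eight significant differences give little evidence of a real effect.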


CHAPTER  IV 
DISCUSSION 

Phase  I:     Correlation  Study 
As stated in Chapter III, Table 4 is a frequency distribution
of the correlation coefficients between the raw
score on the Communications section and the raw score on the
remainder of the same examination for 237 competing groups.
While the range of these coefficients approximates the entire
spectrum of correlation coefficients, there is a clustering
near +.20, rather than around .00. Of the total, only 13.9%
of the coefficients are negative, and only 5.9% are positive
with a magnitude greater than .50. These deductions leave
80.2% of the coefficients falling in the range of .00 through
.499. The observed median of the 237 correlation coefficients
is +.194, with the Q1 and Q3 points falling at approximately
+.100 and +.300, respectively. From these data, it seems safe
to assume that the typical communications test, based on
Haney's Uncritical Inference Test (1955), has a low, positive
correlation with the remainder of the advancement examination.
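The percentages and the median cited above can be recomputed from the class frequencies in Table 4. The following Python sketch is illustrative only: it uses the nominal lower class limits and a constant class width of .05 for the interpolation, both of which are assumptions, and the names are hypothetical.

```python
# (nominal lower class limit, number of coefficients), lowest class first,
# transcribed from Table 4.
TABLE_4 = [
    (-0.999, 1), (-0.350, 2), (-0.300, 2), (-0.250, 1), (-0.200, 7),
    (-0.150, 2), (-0.100, 7), (-0.050, 11), (0.000, 19), (0.050, 19),
    (0.100, 17), (0.150, 35), (0.200, 35), (0.250, 29), (0.300, 11),
    (0.350, 15), (0.400, 5), (0.450, 5), (0.500, 4), (0.550, 2),
    (0.600, 2), (0.650, 2), (0.700, 1), (0.750, 1), (0.800, 2),
]


def grouped_median(classes, width=0.05):
    """Median of a grouped frequency distribution by linear interpolation."""
    total = sum(count for _, count in classes)
    half = total / 2
    below = 0
    for lower, count in classes:
        if below + count >= half:
            return lower + (half - below) / count * width
        below += count
    raise ValueError("empty frequency table")


def percent_where(classes, predicate):
    """Percentage of cases whose class lower limit satisfies predicate."""
    total = sum(count for _, count in classes)
    hits = sum(count for lower, count in classes if predicate(lower))
    return 100.0 * hits / total


negative = percent_where(TABLE_4, lambda lower: lower < 0)
above_half = percent_where(TABLE_4, lambda lower: lower >= 0.5)
between = percent_where(TABLE_4, lambda lower: 0 <= lower < 0.5)
print(round(negative, 1), round(above_half, 1), round(between, 1))
print(round(grouped_median(TABLE_4), 3))
```

Under these assumptions the class counts reproduce the 13.9%, 5.9%, and 80.2% shares, and the interpolated median falls at about +.194.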

Table  5  is  a  distribution  of  the  coefficients  for  all 
E-8  groups  used  in  the  data  covered  in  Table  4,  and  Table  6 
is  a  comparable  distribution  for  E-9  groups.     Both  Table  5 
and  Table  6  are  further  broken  down  by  series.     It  can  be  ob- 
served that  the  median  values  fluctuate  around  the  overall 
median  of  +.194,  ranging  from  a  low  of  +.075  to  a  high  of  +.272.
It  is  also  obvious  that  the  shapes  and  ranges  of  the  several 
distributions  are  different.     These  differences,  however, 
are  not  large,  and  the  description  "low  positive  correlation" 
applies  equally  well  to  all  six  "single  paygrade-single  series" 
groups.

For  each  paygrade  and  series  combination  there  is  a  unique 
Communications  subtest;  this  accounts  for  much  of  the  varia- 
tion in  the  coefficients.     Also,  the  remainder  of  the  advance- 
ment examination  varies  some  from  series  to  series;  the  actual 
items  are  different,  but  the  items  of  a  given  type  are  similar, 
of  about  the  same  range  of  difficulty,  and  have  been  shown  to 
be  measuring  about  the  same  traits.     While  a  number  of  persons 
will  participate  in  both  of  a  pair  of  consecutive  series,  the 
populations  are  far  from  identical,  and  a  calendar  year  has 
expired  between  series.     These  differences  in  the  remainder 
of  the  examination  and  in  the  population  that  is  examined  un- 
doubtedly account  for  a  portion  of  the  variation. 

From  all  of  this,  it  is  probably  safe  to  state  that  there
is  a  low,   positive  correlation  between  scores  on  the  Communi- 
cation subtest  and  scores  on  the  remainder  of  the  typical  ad- 
vancement examination  for  Senior  Chief  Petty  Officer  and  Master 
Chief  Petty  Officer.     A  correlation  of  the  magnitude  that  is 
representative  of  those  found  in  this  study  indicates  some 
communality,  but  also  indicates  that  there  is  more  independence 
than  communality.     In  all  likelihood,  the  Communication  subtest 
is measuring a trait, or traits, that is (are) somewhat independent
of the traits measured by the rest of the examination.



Further  evidence  that  the  Communication  subtest  may  be 
measuring  a  trait,  or  cluster  of  traits,  is  derived  from  the 
intrasection  reliability  data.     These  coefficients  of  reli- 
ability were  calculated  using  the  Kuder-Richardson  formula 
20  (1937).     The  median  values  of  these  coefficients  are  pre- 
sented in  Table  9. 

The  values  in  Table  9  are  larger  than  the  coefficients 
reported  in  Tables  5  and  6  and  are  of  a  magnitude  that  may 
indicate a predictable trait. These reliability values, when
derived from a 20-item subtest, indicate a satisfactory level of
reliability. In other words, performance on the 20-item subtest
is predicted better by itself than by the remaining 130
items on the examination.
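For readers unfamiliar with Kuder-Richardson formula 20, the coefficient is k/(k - 1) times (1 - sum of p*q over items / variance of total scores), where k is the number of items, p is the proportion passing an item, and q = 1 - p. The Python sketch below is a minimal illustration with hypothetical right/wrong data; the use of the population variance (divisor n) for the total scores is an assumption, and the function name is not taken from the study.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    item_matrix: list of rows, one per examinee, each a list of 0/1
    item scores (1 = correct).
    """
    n = len(item_matrix)
    k = len(item_matrix[0])

    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    # Assumption: population variance (divisor n) of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion passing item j
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of four examinees to a three-item test.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(round(kr20(responses), 3))
```

The coefficient rises as the items rank examinees more consistently, which is why it serves as an index of internal consistency for a short subtest.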


Table 9

Median Values of Intrasection Reliability Based
Upon Competing Groups Used in Correlation Study

Series     Paygrade     Reliability
                        (Median)

A             8            .61
A             9            .625
B             8            .725
B             9            .69
C             8            .56
C             9            .56

Phase  II:     Discrimination  Study 
In  the  attempt  to  discover  whether  the  scores  on  the 
Communication  subtest  predicted  the  selection  board's  choices, 
the  difference  between  the  Communication  raw  score  of  a  rate's 
selectees  and  nonselectees  was  determined  for  each  rate  in- 
volved in  the  correlation  studies.     As  was  mentioned  in  the 
previous  chapter,  these  differences  were  summarized  in  Table  7.

Looking  at  the  means  of  the  difference  scores  for  the  six 
groups  (a  group  consists  of  the  ratings  used  at  a  paygrade 
in  a  single  series),  the  mean  differences  ranged  from  +.147
to  +.335.     The  weighted  arithmetic  mean  of  these  differences
is  +.2126.     An  average  difference  of  less  than  one-fourth  of
a  raw  score  point  on  a  20-item  subtest  cannot  be  used  as  a 
practical  or  reliable  means  of  predicting  group  membership. 
The  raw  scores  of  any  selectee  group  and  its  corresponding 
nonselectees  fall  into  frequency  distributions  with  large 
overlaps.    Even  at  the  extremes  of  the  two  distributions,  use 
of  this  raw  score  alone  to  predict  selection  or  nonselection 
would  result  in  a  few  false  predictions  for  each  group. 

The  statistical  stability  of  these  differences  was  tested 
by  dividing  each  mean  difference  by  the  standard  error  of  the 
difference  to  calculate  a  t  value.     For  the  six  groups  in 
Table  7,  it  is  seen  that  these  t  values  range  from  +.89  up 
to  +1.22.     The  median  of  these  t  values  is  +.95.     These  data 
indicate  that  the  two  groups  (selectees  and  nonselectees)  may 
really  be  one  group  and  that  these  small  differences  are  due 
to chance or casual differences. Added to the actual overlap
of the data of the two groups, these t values indicate that
the raw score on the Communication subtest does not, by itself,
predict membership in the groups as selected by the
several  selection  boards. 

Other  Observations  and  Results 

During  the  process  of  entering  the  raw  data  for  the  cal- 
culation of  correlation  coefficients  and  the  descriptive  sta- 
tistics, and  during  the  process  of  hand  recording  the  results 
of  these  computations,  the  author  noticed  that  the  mean  Com- 
munication subtest  scores  of  various  ratings  differed  from 
one  another  in  a  rather  consistent  fashion.     More  specifically, 
it  was  noticed  that  some  ratings  had  much  higher  scores  than 
others  and  that  the  pattern  appeared  to  hold  true  for  both 
paygrades and for more than one series. Often, these interrating
differences were considerably larger than the typical
intrarating difference between selectees and nonselectees.
If  these  differences  between  the  specialties  (ratings)  could 
be  shown  as  real  and  consistent,  there  would  be  further  evi- 
dence that  the  subtest  was  measuring  a  trait. 

In  an  attempt  to  determine  whether  a  rating's  score  on 
the  subtest  in  one  series  and  paygrade  would  predict  that 
rating's  score  in  a  different  series  and/or  paygrade,  a  series 
of  correlation  coefficients  was  computed.     These  coefficients
were  computed  on  the  Monroe  326,  using  Monroe's  program  num- 
ber 9211W  (Linear  Regression).     The  several  correlations  and 
the  data  upon  which  each  was  based  are  delineated  in  Tables 
10,  11,  and  12. 


The  letter-number-letter  codes  in  the  X  data  and  Y  data 
columns  of  Tables  10,  11,  and  12  stand  for  the  series,  pay- 
grade,  and  selectee-nonselectee  status  of  the  group.  B-8-S 
indicates series B, paygrade E-8, and selectees; A-9-N means
series A, paygrade E-9, and nonselectees. The $r_{xy}$ column lists
the Pearson product-moment correlations between the indicated X
and Y of specialties or ratings.

Of  the  29  coefficients  listed  in  Tables  10,  11,  and  12, 
5  are  greater  than  +.90  and  23  are  greater  than  +.75.  Before 
discussing any of the implications of these large coefficients
of correlation, another fact must be considered: series A
is the most recent examination cycle, and series C the earliest.
Personnel who were not selected at series C may have been
reexamined  in  series  B,  and,  in  turn,   the  nonselectees  of 
series  B  may  have  been  reexamined  in  series  A.     In  actual 
practice,  at  least  half  of  the  nonselectees  participate  in 
the  next  series  of  examinations. 

With  these  facts  in  mind,  observe  that  four  of  the  five 
coefficients  greater  than  +.90  are  between  successive  series 
groups  at  the  same  paygrade  with  a  high  probability  of  large 
overlap  in  actual  participants.  Also,  the  three  coefficients 
that  are  below  +.70  in  value  are  between  groups  with  no  par- 
ticipants in  common.  These  could  be  interpreted  to  indicate 
that  there  is  evidence  for  rather  high  test-retest  reliability 
with  a  reasonably  equivalent  form  used  for  the  retest.

Of the 22 coefficients between populations with no members
in common, the highest was +.923, the lowest +.527, and
the median +.78. These data tend to show the possible
existence of a rating-specific trait that is being measured
by the Communication subtest.


Table 10

The Correlation Coefficients Between the Scores
Achieved by the Several Specialties in the
Various Groups of E-8 Candidates

X Data      Y Data      r_xy

C-8-N       B-8-S       +.776
C-8-N       B-8-N       +.972
C-8-N       A-8-S       +.819
C-8-N       A-8-N       +.840
C-8-N       C-8-S       +.783
C-8-S       B-8-S       +.681
C-8-S       B-8-N       +.768
C-8-S       A-8-S       +.727
C-8-S       A-8-N       +.692
B-8-N       A-8-S       +.901
B-8-N       A-8-N       +.905
B-8-N       B-8-S       +.763
B-8-S       A-8-S       +.803
B-8-S       A-8-N       +.724
A-8-S       A-8-N       +.786


Table 11

Correlation Coefficients Between Scores Achieved
by Ratings in the Groups of E-9 Candidates
from Series A and B

X Data      Y Data      r_xy

B-9-S       A-9-S       +.844
B-9-S       A-9-N       +.812
B-9-S       B-9-N       +.756
B-9-N       A-9-S       +.898
B-9-N       A-9-N       +.948
A-9-S       A-9-N       +.893


Table 12

Correlation Coefficients Between Scores Achieved
by Ratings in the Groups, Across Paygrades,
Using Candidates in Series A and B

X Data      Y Data      r_xy

B-8-S       B-9-S       +.527
B-8-S       B-9-N       +.771
B-8-S       A-9-S       +.816
B-8-S       A-9-N       +.837
B-8-N       B-9-N       +.878
B-8-N       B-9-S       +.721
B-8-N       A-9-S       +.862
B-8-N       A-9-N       +.923
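The counts cited in this section (29 coefficients, 5 above +.90, 23 above +.75, and 3 below +.70) can be checked directly against Tables 10, 11, and 12. The Python sketch below simply tallies the tabled values; the variable and function names are hypothetical.

```python
# r_xy values transcribed from Tables 10, 11, and 12, in table order.
TABLE_10 = [0.776, 0.972, 0.819, 0.840, 0.783, 0.681, 0.768, 0.727,
            0.692, 0.901, 0.905, 0.763, 0.803, 0.724, 0.786]
TABLE_11 = [0.844, 0.812, 0.756, 0.898, 0.948, 0.893]
TABLE_12 = [0.527, 0.771, 0.816, 0.837, 0.878, 0.721, 0.862, 0.923]

ALL_COEFFICIENTS = TABLE_10 + TABLE_11 + TABLE_12


def count_above(values, threshold):
    """Number of coefficients strictly greater than threshold."""
    return sum(1 for value in values if value > threshold)


def count_below(values, threshold):
    """Number of coefficients strictly less than threshold."""
    return sum(1 for value in values if value < threshold)


print("total:", len(ALL_COEFFICIENTS))
print("above .90:", count_above(ALL_COEFFICIENTS, 0.90))
print("above .75:", count_above(ALL_COEFFICIENTS, 0.75))
print("below .70:", count_below(ALL_COEFFICIENTS, 0.70))
```

The tally agrees with the counts given in the text, which supports the transcription of the three tables.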


CHAPTER  V 
RECOMMENDATIONS  FOR  FUTURE  STUDY 

A review of the results of these studies revealed some problems
and unanswered questions. Some ways of relieving
the problems and possibly finding answers to some of the
questions are suggested.

1.  Standardization  of  the  subtest:     As  these  several 
tests  that  have  the  superficial  appearance  of  equivalency 
vary  in  difficulty  and  internal  consistency,  it  would  appear 
that  some  of  them  can  be  revised  in  an  attempt  to  make  the 
several  of  them  more  nearly  equivalent.     If  a  dozen  forms 
were  developed  and  shuffled  about  randomly  among  the  series, 
the  psychometric  discrimination  of  the  subtest  should  be  im- 
proved. 

2.  A  complete  study  of  the  rating-specific  qualities 
of  the  subtest,  using  all  ratings  in  several  series. 

3.  If  recommendation  2,  above,  confirms  the  rating- 
specific  trait,  investigate  whether  this  trait  develops  during 
the  years  that  these  candidates  have  worked  in  these  special- 
ties (ratings)  or  exists  before  the  individual  has  worked  at 
the  specialty. 

4.  If  the  traits  appear  to  be  preexistent,  investigate 
the  possible  use  of  such  an  instrument  as  a  differential  se- 
lection device. 




5.     As  the  studies  pursued  in  the  investigation  of  the 
original  problem  gave  little  evidence  of  validity,  develop 
other  means  of  probing  the  validity  of  the  subtest  as  an 
advancement  predictor. 


REFERENCES 

Bureau of Naval Personnel. Manual of Qualifications for
Advancement in Rating (NAVPERS 18068 Series). Washington,
D.C.: United States Navy, current series.

Cooperative Reading Comprehension Test (Reading Comprehension:
Cooperative English Test). Princeton, N.J.: Cooperative
Test Division, Educational Testing Service, 1940.

Cooperative Reading Comprehension Test (Reading Comprehension:
Cooperative English Test). Princeton, N.J.: Cooperative
Test Division, Educational Testing Service, 1951.

Dow, A. N., & Macaluso, C. J. The new E-8 and E-9 exams:
A first year report. Proceedings of the 12th Annual
Conference, Military Testing Association, 1970, 99-104.

DuBois, P. H. A test-dominated society: China, 1115 B.C. -
1905 A.D. In A. Anastasi (Ed.), Testing Problems in
Perspective. Washington, D.C.: American Council on
Education, 1965.

Haney, W. V. Measurement of the ability to discriminate
between inferential and descriptive statements (Doctoral
dissertation, Northwestern University, 1953). Dissertation
Abstracts, 1954, 14, No. 7037.

Haney, W. V. The Uncritical Inference Test. Wilmette, Ill.:
William V. Haney Associates, 1955.

Haney, W. V. Police experience and unconscious inference
behavior. General Semantics Bulletin, 22 & 23, 1958.

Jenckes, T. A. Civil Service of the United States (Report
No. 47). Washington, D.C.: 40th Congress, 2nd Session,
May 1868.

Kuder, G. F., & Richardson, M. W. The theory of the estimation
of test reliability. Psychometrika, 2, 1937, 151-160.

Macaluso, C. J., & Dow, A. N. E-8 and E-9: A new approach.
Proceedings of the 11th Annual Conference, Military Testing
Association, 1969 [page numbers illegible].

Monroe, the Calculator Company. Beta 326 Scientist (Manual
of directions for the 326 Scientist programmable calculator).
Orange, N.J.: Monroe, the Calculator Company
(Litton Industries), no date.

Monroe, the Calculator Company. Beta 326 Regression Analysis
Pak. Orange, N.J.: Monroe, the Calculator Company
(Litton Industries), no date.

Ohio State University Psychological Test. Chicago: Science
Research Associates, 1919-1947.

United States Navy. The Bluejackets' Manual (9th ed.).
Annapolis, Md.: United States Naval Institute, 1939.

United States Navy. The Bluejackets' Manual (14th ed.).
Annapolis, Md.: United States Naval Institute, 1950.

United States Navy. The Bluejackets' Manual (19th ed.).
Annapolis, Md.: Naval Institute Press, 1973.

Van der Veer, N. R. The Bluejacket's Manual, United States
Navy. New York: Military Publishing Co., 1916.

Watson, G., & Glaser, E. M. Watson-Glaser Tests of Critical
Thinking. New York: World Book Co.,


BIOGRAPHICAL  SKETCH 
Andrew  N.  Dow  was  born  in  Jacksonville,  Florida,  where 
he  attended  the  public  schools,  graduating  from  Andrew  Jackson 
High  School.    His  entire  undergraduate  career  was  at  the 
University  of  Florida.     The  major  field  for  his  B.A.  and  M.A.
degrees  was  psychology.     In  addition  to  the  University  of 
Florida,  he  did  a  year  of  graduate  study  at  the  University  of 
Minnesota. 

His  professional  career  has  included  work  at  the  Florida 
Industrial  School  at  Marianna,  Florida  State  Board  of  Health, 
Farragut  College  (Idaho),  General  Extension  Division  of 
Florida,  Miami-Dade  Junior  College,  and  various  aspects  of 
the  U.S.  Navy's  training  and  personnel  programs,  both  in  uni- 
form and  as  a  civilian  employee.     He  is  currently  employed 
in  the  Evaluation  and  Analysis  Division  of  the  Naval  Educa- 
tion and  Training  Program  Development  Center,  Pensacola, 
Florida. 

He  is  married  to  the  former  Grace  Ullman  who  was  a  staff 
member  of  the  College  of  Medicine  at  the  University  of  Florida. 
They  have  one  daughter,  Rebecca  Grace,  who  attends  public 
school  in  Pensacola. 




I  certify  that  I  have  read  this  study  and  that  in  my 
opinion  it  conforms  to  acceptable  standards  of  scholarly  pres- 
entation and  is  fully  adequate,  in  scope  and  quality,  as  a 
dissertation  for  the  degree  of  Doctor  of  Education. 

Vynce  A.  Hines


Professor  of  Foundations  of  Education 


I  certify  that  I  have  read  this  study  and  that  in  my 
opinion it conforms to acceptable standards of scholarly presentation
and is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Education.


Richard  J.  Anderson 
Professor  of  Psychology

I  certify  that  I  have  read  this  study  and  that  in  my 
opinion  it  conforms  to  acceptable  standards  of  scholarly  pres- 
entation and  is  fully  adequate,  in  scope  and  quality,  as  a 
dissertation for the degree of Doctor of Education.


Robert B. Myers
Professor of General Teacher Education


This  dissertation  was  submitted  to  the  Graduate  Faculty  of 
the  College  of  Education  and  to  the  Graduate  Council,  and  was 
accepted  as  partial  fulfillment  of  the  requirements  for  the 
degree  of  Doctor  of  Education. 

August  1977 


Dean,  College  of  Education


Dean,  Graduate  School 


